Are Latent Sentence Vectors Cross-Linguistically Invariant?

Author

  • Michael Hahn
Abstract

Previous work [Bowman et al., 2016] has shown that variational autoencoders (VAEs) can create distributed representations of natural language that capture different linguistic levels such as syntax, semantics, and style in a holistic manner. I investigate to what extent VAEs, when trained on different languages, result in comparable representations. To this end, I train VAEs for English and French, and then train a transformation between the resulting latent spaces on the task of machine translation. An analysis of the resulting mapping from French to English sentences shows that the latent representations encode the presence of words, phrases, and the general topic. However, I do not find evidence that they also encode syntax and semantics in a cross-linguistically invariant manner.
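
The idea of learning a transformation between two latent spaces can be illustrated with a small sketch. The abstract does not say what form the transformation takes or how it is optimized, so everything below is an assumption made for illustration: the map is linear, it is fitted by gradient descent on squared error, and randomly generated vectors stand in for the French and English VAE encodings. The names `fit_linear_map`, `apply_map`, `mse`, `fr_latents`, and `en_latents` are hypothetical and do not come from the paper.

```python
import random


def apply_map(W, s):
    """Apply the matrix W (list of rows) to the source vector s."""
    return [sum(w * x for w, x in zip(row, s)) for row in W]


def mse(W, src, tgt):
    """Mean squared error between mapped source vectors and target vectors."""
    total = 0.0
    for s, t in zip(src, tgt):
        pred = apply_map(W, s)
        total += sum((p - y) ** 2 for p, y in zip(pred, t))
    return total / len(src)


def fit_linear_map(src, tgt, lr=0.1, epochs=300):
    """Fit W minimizing mse(W, src, tgt) with full-batch gradient descent.

    ASSUMPTION: a linear map with a squared-error objective. The paper
    trains its transformation on machine translation; this is a stand-in.
    """
    n, d_src, d_tgt = len(src), len(src[0]), len(tgt[0])
    W = [[0.0] * d_src for _ in range(d_tgt)]
    for _ in range(epochs):
        grad = [[0.0] * d_src for _ in range(d_tgt)]
        for s, t in zip(src, tgt):
            pred = apply_map(W, s)
            for i in range(d_tgt):
                err = pred[i] - t[i]
                for j in range(d_src):
                    grad[i][j] += 2.0 * err * s[j] / n
        for i in range(d_tgt):
            for j in range(d_src):
                W[i][j] -= lr * grad[i][j]
    return W


# Synthetic stand-ins for paired sentence encodings (NOT real VAE outputs).
rng = random.Random(0)
dim, n_pairs = 4, 200
true_W = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]
fr_latents = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_pairs)]
en_latents = [apply_map(true_W, z) for z in fr_latents]

zero_W = [[0.0] * dim for _ in range(dim)]
W = fit_linear_map(fr_latents, en_latents)
print("error before fitting:", mse(zero_W, fr_latents, en_latents))
print("error after fitting: ", mse(W, fr_latents, en_latents))
```

In this toy setting the target vectors are an exact linear function of the source vectors, so the fitted map recovers them almost perfectly. Real latent spaces from separately trained VAEs need not be related this simply, which is the question the paper examines.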

Publication date: 2017